Generation of endogenously tagged E-cadherin cells using gene editing via non-homologous end joining

SUMMARY
We provide a protocol that uses non-homologous end joining to integrate the DNA sequence encoding a fluorescent protein at the CDH1 locus, which encodes the epithelial glycoprotein E-cadherin. We describe steps for implementing the CRISPR-Cas9-mediated knock-in procedure by transfecting a cancer cell line with a pool of plasmids. The EGFP-tagged cells are traced by fluorescence-activated cell sorting and validated at the DNA and protein levels. The protocol is flexible and can in principle be applied to any protein expressed in a cell line. For complete details on the use and execution of this protocol, please refer to Cumin et al. (2022).1


BEFORE YOU BEGIN
Gene editing using the CRISPR-Cas9 system is now widely applied in human cells. Upon Cas9-induced DNA double-strand breaks, cells rely on the non-homologous end joining (NHEJ) repair pathway, which is prone to introducing insertion or deletion (indel) mutations that mostly result in gene knockouts at the targeted genomic site.2 For gene knock-ins, homology-directed repair (HDR) is the preferred way to integrate large single- and double-stranded oligonucleotide sequences; however, HDR-based knock-ins are difficult to generate because of low efficiency.3,4 In contrast, NHEJ has been suggested to be more efficient for generating knock-ins because it does not depend on a specific cell cycle phase and does not require regions of homology. Despite its reputation as error-prone, NHEJ has recently been shown to ligate DNA ends accurately without indels.5,6 To overcome the limitations of HDR, we take advantage of NHEJ to integrate reporter genes that tag cell surface proteins.

In this protocol, we provide a detailed experimental section on developing a plasmid-based knock-in in the human ovarian cancer cell line BG1 using the CRISPR-Cas9 system. This approach builds on a previously reported knock-in strategy,3 further adapted here to generate endogenously tagged proteins, exemplified by the transmembrane protein E-cadherin. The concept of this knock-in strategy is the combination of efficient and precise gene editing using NHEJ in human cancer cells, as demonstrated in a previous study.3 The sgRNA is designed to target a site as close as possible to the E-cadherin translation stop site. We provide a set of four plasmids necessary to integrate the EGFP-encoding sequence as double-stranded DNA at the genomic locus of CDH1 (Figure 1A). The transient transfection of cancer cells results in high expression of hCas9 together with two sgRNAs controlled by the U6 promoter.
The first sgRNA (sgCDH1) binds to the genomic locus, while the second sgRNA (Sg-A) targets the flanking sites of the fourth plasmid encoding EGFP (Figure 1B). The released
1. Day 1-2: Design and order PCR primers for removing the IRES from the double cut NH-donor using NEBaseChanger (https://nebasechanger.neb.com/).
ii. Prepare the PCR reaction as shown in Table 1, then mix well and spin down briefly.
i. Prepare the digestion reaction as shown in Table 3, mix well and spin down.
ii. Incubate at 20°C-22°C for 5 min.
c. Heat-shock transformation into competent cells.
ii. Add 100 µL of the DH5-alpha to the ligation reaction.
Alternatives: 2×YT is a rich medium used in our laboratory; however, LB broth or Terrific Broth are also suitable for the transformation procedure with DH5-alpha. Other competent cells, such as Stbl3, can be used instead of DH5-alpha.
3. Day 4: Pick a colony from the plate using a pipet tip and place it into a 14 mL round-bottom tube containing 3 mL of LB-Carb-broth (LB broth containing 100 µg/mL of carbenicillin).

This section provides details on how to design and clone an sgRNA that will be used for inserting the EGFP-encoding DNA sequence into the CDH1 locus. An sgRNA sequence for gene editing of a desired genomic location is a 20-nucleotide sequence (target sequence) followed by the protospacer adjacent motif (PAM) NGG (for details, also see Figure 1C) (troubleshooting 1).
ii. You then need to "Design and analyze guides" and select a single guide, 20 bp length, the corresponding human genome, and the "NGG" PAM.
Note: The described settings are suitable for the current experimental setup using the Cas9 which has been derived from a human codon-optimized Streptococcus pyogenes Cas9 (hSpCas9).
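The target definition used for guide selection (a 20-nucleotide target sequence immediately followed by an NGG PAM) can be illustrated with a short Python sketch. It is not a substitute for Benchling and computes no on-target or off-target scores. The function names are hypothetical, and the bases flanking the guide in the usage example are invented toy context; only the 20-nucleotide guide sequence is taken from this protocol.

```python
import re

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_spcas9_targets(seq):
    """List (start, target, pam, strand) for each 20-nt site followed by NGG.

    `start` is 0-based on the scanned strand; for "-" hits it refers to the
    reverse-complemented sequence, not to the input orientation.
    """
    seq = seq.upper()
    hits = []
    for strand, template in (("+", seq), ("-", reverse_complement(seq))):
        # Zero-width lookahead so that overlapping candidate sites are all found.
        for match in re.finditer(r"(?=([ACGT]{20})([ACGT]GG))", template):
            hits.append((match.start(), match.group(1), match.group(2), strand))
    return hits


# Guide from this protocol. The flanking "TT", the "TGG" PAM and the trailing
# "AA" are invented toy context, NOT the real CDH1 sequence.
GUIDE = "GAGGCGGCGAGGACGACTAG"
toy_locus = "TT" + GUIDE + "TGG" + "AA"

for hit in find_spcas9_targets(toy_locus):
    print(hit)
```

In practice the input would be the genomic sequence around the translation stop site, and the candidates would still need to be ranked with a dedicated design tool.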
e. Highlight the CDH1 exon 16 target region in the "Sequence map" tab.
f. Next, use the "Design CRISPR" tab and click on the "+" symbol in order to obtain possible sgRNAs. We selected the sgRNA (5′-GAG GCG GCG AGG ACG ACT AG-3′) targeting the region upstream of the translation stop site of E-cadherin, with an "on-target" and "off-target" score of 18.7 and 91.5, respectively.
Note: If other proteins are of interest, you may also consider selecting sgRNAs with high "on-target"8 and "off-target"9 scores using Benchling (troubleshooting 2).
iii. Use the following thermocycler programme for annealing:
iv. Visualize annealed oligonucleotides on a 2% (w/v) agarose gel (e.g., 2 g of agarose in 100 mL of 1× TAE buffer, diluted from 50× TAE buffer with ddH2O) (Figure 2A).
b. Vector digestion:
i. Prepare the digestion reaction of the plasmid MLM3636 using the restriction endonuclease BsmBI (Table 4).
ii. Use the following programme for the digestion:
iii. Run a 1% (w/v) agarose gel and cut out the band at a size of 2,253 bp (Figure 2B).
iv. Perform agarose gel purification using the Wizard® SV Gel and PCR Clean-Up System kit according to the manufacturer's instructions.
Alternatives: Apart from the above-mentioned kit for agarose gel purification, other equivalent kits or protocols can be used (e.g., Zymoclean™ Gel DNA Recovery Kit, ZYMO RESEARCH).
Pause point: The annealed oligonucleotides as well as the digested and purified plasmid can be stored at -20°C for long-term storage.
c. Ligation of annealed oligonucleotides with purified digested vector:
a. Prepare the master mix for a colony PCR (Table 6) using human U6 as the forward primer and the reverse sgRNA as the reverse primer.
b. Pick a colony with a pipet tip, dip it into the PCR tube containing the prepared master mix, and then place the tip into a liquid overnight culture (14 mL round-bottom tube containing 3 mL of LB-Carb-broth).
i. Repeat this step for ~10 colonies.
c. Perform the colony PCR with the parameters given in Table 7.
d. Visualize the PCR result on a 1.5% (w/v) agarose gel (Figure 3).
e. Incubate the overnight cultures of the positive colonies at 37°C with shaking at 200 rpm for 16 h.
5. Day 4: Plasmid preparation, Sanger DNA sequencing and glycerol stocks.
a. Perform a plasmid preparation using the ZR Plasmid Miniprep - Classic Kit according to the manufacturer's protocol. For this purpose, use 2 mL of the overnight culture.
b. Measure the concentration and purity and prepare the plasmid DNA for Sanger DNA sequencing as described above.
i. In our case, selected clones were shipped along with 10 µM of human_U6_F primer in a separate tube.
c. In parallel, prepare a glycerol stock as described above.
6. Day 5: Analyze the DNA sequencing results.
a. We usually use 4Peaks for visualization of .ab1 sequencing files (Figure 4).
b. If the DNA sequence confirms successful insertion of the desired oligonucleotide, either prepare more plasmid (mini- or midiprep) or directly continue with transient transfection of the desired cell line (BG1) as described below. This construct is hereinafter referred to as "sgCDH1".
Pause point: The plasmid DNA can be stored at -20°C for long-term storage.
Optional: If a higher quantity of plasmid DNA is needed, start a new overnight culture for a larger plasmid purification, i.e., inoculate 150 mL of LB-Carb-broth with 100 µL of the existing culture and incubate at 37°C and 200 rpm for 16 h. On the next day, perform a plasmid preparation using the NucleoBond® Xtra Midi Plus kit according to the manufacturer's protocol.
Pause point: The plasmid DNA can be stored at -20°C for long-term storage. Glycerol stocks should be kept at -80°C for long-term storage.
CRITICAL: We suggest obtaining a high concentration (~500 ng/µL) and purity for all plasmids in order to increase the efficiency of transient transfections.
Autoclave the solution and store at 20°C-22°C. To prepare the plates, heat it up in a microwave and add the appropriate antibiotics. Store the plates at 4°C for up to 1 month.

KEY RESOURCES TABLE
LB Broth: add 10 g of LB Broth (Lennox) to 500 mL of ddH2O.
Autoclave the solution and store at 20°C-22°C. As soon as antibiotics have been added, store at 4°C for up to 1 month.
2×YT Broth: add 15.5 g of 2×YT Broth to 500 mL of ddH2O.
Autoclave the solution and store at 20°C-22°C for up to 3 months.
Bromophenol Blue 10% solution: add 1 g of Bromophenol Blue to 10 mL of ddH2O.
Store at 20°C-22°C for up to 12 months.
Store at -20°C for up to 12 months.
10% SDS: add 1 g of SDS to 10 mL of ddH2O.
Store at 20°C-22°C for up to 12 months.
Store at 4°C for up to 12 months.
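When a different batch volume is needed, the powder amounts in the recipes above scale linearly with the final volume. The following Python sketch illustrates this for the four recipes whose amounts are stated here; the dictionary and function names are hypothetical, and the calculation ignores the small volume contributed by the powder itself.

```python
# Grams of powder per mL of ddH2O, taken from the recipes listed above.
RECIPES_G_PER_ML = {
    "LB Broth (Lennox)": 10 / 500,
    "2xYT Broth": 15.5 / 500,
    "Bromophenol Blue 10%": 1 / 10,
    "SDS 10%": 1 / 10,
}


def grams_needed(recipe, final_volume_ml):
    """Return the powder mass (g) for the requested volume of ddH2O."""
    if final_volume_ml <= 0:
        raise ValueError("final_volume_ml must be positive")
    return round(RECIPES_G_PER_ML[recipe] * final_volume_ml, 3)


for name in RECIPES_G_PER_ML:
    print(name, grams_needed(name, 250), "g per 250 mL")
```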
Store at 4°C for up to 5 weeks.
Store at 4°C for up to 5 weeks.
Permeabilization buffer: Prepare a solution of PBS containing 0.25% Triton X-100. Store at 20°C-22°C for up to 3 months.
Washing buffer: Prepare a solution of PBS containing 0.1% Tween 20.
Store at 20°C-22°C for up to 3 months.

The following describes the procedure to obtain a cancer cell line with a functional E-cadherin knock-in. For representative purposes, we used the human ovarian cancer cell line BG1, previously described to express sufficient amounts of E-cadherin.1 We describe the required cell culture conditions, cell line transfection using appropriate controls, and evaluation of knock-in cells using either fluorescence-activated cell sorting (FACS) or flow cytometry for EGFP+ cell sorting or analysis, respectively. A schematic flowchart for the knock-in procedure is shown in Figure 5.
1. Seeding of the BG1 ovarian cancer cell line.
i. Prepare a 12-well plate by adding 1 mL of culture medium containing 150,000 cells per well.
ii. Distribute the cells evenly in each well by gentle manual horizontal shaking.
2. One day after cell seeding, prepare the plasmid combinations and perform the transfection.
a. Prepare the plasmid pools and appropriate controls in 1.5 mL tubes as suggested in Table 8.
b. Transient transfection of cancer cells.
i. Exchange the culture medium of the cells 1-3 h before transfection using 1 mL of culture medium.
ii. Add 100 µL of serum-free RPMI to each plasmid pool and vortex gently.
iii. Pre-warm the ViaFect reagent to 20°C-22°C and mix by inverting it.
iv. Add 4 µL of ViaFect to each plasmid pool, vortex gently, and spin down briefly.
v. Incubate the reactions at 20°C-22°C for 20 min.
vi. Add the reactions dropwise to the cells in the 12-well plate.
vii. Gently rock the plate to distribute the DNA-ViaFect complexes, then incubate the cells at 37°C and 5% CO2 until the next day (troubleshooting 3).
CRITICAL: Be sure to use an early passage of your cell line of interest, as the entire process of knock-in generation includes multiple passaging steps.
A high plasmid concentration is beneficial, but for the transfection the plasmids should be diluted so that at least 1 µL is pipetted, which increases accuracy.
It is important to add appropriate controls to the transfection setup in order to determine the overall efficiency of the transfection reagent in combination with the cell line of interest (e.g., pmaxGFP or pEGFP-N1), to confirm successful cleavage by Cas9 (by comparison with the mutant hCas9_D10A) and, finally, to assess the signal generated by the double-cut EGFP only plasmid alone. A non-transfected control should also be included.
The plasmid amounts provided in Table 8 are specific to a 12-well format. Consequently, the amounts must be adapted if the format is changed, but we suggest keeping the ratio of the plasmids.
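One possible way to adapt the plasmid amounts to another plate format while keeping their ratio is to scale every component by the relative growth area. The Python sketch below illustrates this idea only. The growth areas are approximate, vendor-dependent values, the plasmid amounts are placeholders and not the values of Table 8, and scaling by area is an assumption rather than a recommendation from the protocol.

```python
# Approximate growth area per well (cm^2); check the plate manufacturer's data.
GROWTH_AREA_CM2 = {"24-well": 1.9, "12-well": 3.8, "6-well": 9.5}

# Placeholder amounts (ng) for one 12-well reaction, NOT the values of Table 8.
POOL_NG_12_WELL = {
    "hCas9": 400.0,
    "sgCDH1": 200.0,
    "Sg-A": 200.0,
    "double-cut EGFP only": 200.0,
}


def scale_pool(pool_ng_12_well, target_format):
    """Scale each plasmid amount by growth area relative to a 12-well plate."""
    factor = GROWTH_AREA_CM2[target_format] / GROWTH_AREA_CM2["12-well"]
    return {name: round(ng * factor, 1) for name, ng in pool_ng_12_well.items()}


scaled = scale_pool(POOL_NG_12_WELL, "6-well")
print(scaled)
```

Because every plasmid is multiplied by the same factor, the ratio within the pool is unchanged. The reagent and medium volumes would need to be adapted as well, ideally following the transfection reagent manufacturer's guidance.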

Major step two: Flow cytometry analysis and FACS enrichment of EGFP+ cells
Timing: 4-7 weeks (for steps 3 to 5)
After transfection, incubate the cells for 48 h or until they reach ~80% confluency and perform analysis by flow cytometry in order to determine the percentage of EGFP+ cells together with the transfection efficiency (control plasmid). The percentage of the knock-in population might be very low at this stage. After keeping the cells in culture for up to two weeks, enrich the transfected cells for the EGFP+ population by fluorescence-activated cell sorting (FACS). Keep the EGFP+ cells under appropriate culture conditions until a sufficient cell number is reached for re-analysis. In order to achieve >80% EGFP+ cells, at least 2 to 3 FACS sorts are necessary. The procedure is explained in detail below.
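The decision logic of this step (quantify EGFP+ cells, then sort again until the population exceeds 80%) can be sketched in Python as follows. This is a simplified illustration on made-up fluorescence values, not the gating strategy of the cytometer software. The function names and the choice of placing the gate at the 99.9th percentile of the non-transfected control are assumptions.

```python
def gate_threshold(control_events, quantile=0.999):
    """Fluorescence value below which about `quantile` of control events fall."""
    ordered = sorted(control_events)
    index = min(len(ordered) - 1, int(quantile * len(ordered)))
    return ordered[index]


def percent_positive(sample_events, threshold):
    """Percentage of events with fluorescence strictly above the gate."""
    positive = sum(1 for value in sample_events if value > threshold)
    return 100.0 * positive / len(sample_events)


def needs_another_sort(percent_egfp, target_percent=80.0):
    """True while the EGFP+ fraction has not exceeded the >80% target."""
    return percent_egfp <= target_percent


# Made-up EGFP intensities (arbitrary units) for illustration only.
control = [100.0] * 1000                 # non-transfected control
sample = [50.0] * 900 + [1000.0] * 100   # transfected pool before sorting

gate = gate_threshold(control)
fraction = percent_positive(sample, gate)
print(gate, fraction, needs_another_sort(fraction))
```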

3. Preparing cells for flow cytometry analysis.
CRITICAL: This should be done no earlier than 48 h after transfection or when the cells reach ~80% confluence. This initial analysis is crucial to determine the transfection efficiency, i.e., the percentage of EGFP+ cells in the 'transfection efficiency' control.
a. Remove the culture medium and wash the cells by adding 300 µL of 1× PBS per well.
b. Remove the PBS and add 300 µL of 1× Trypsin-EDTA.
c. Incubate cells at 37°C and 5% CO2 until they completely detach from the plate (up to 15 min in the case of BG1).
d. Add 600 µL of culture medium to stop the trypsin reaction.
e. Gently pipet the cells up and down to detach all remaining cells and to dissolve clumps.
f. Place 300 µL of the suspensions into 1.5 mL Eppendorf tubes (for flow cytometry analysis) and plate the remaining 600 µL into a new 6-well plate.
Note: The aim is to expand the knock-in cells (not the controls) at least to a T25 or a T75 flask for FACS (described in the next step). The controls can be kept for an additional 2-3 passages and re-analyzed by flow cytometry at each passage to determine whether the GFP signal is decreasing.
g. Centrifuge the Eppendorf tubes for flow cytometry analysis at 400 × g for 5 min at 20°C-22°C and resuspend the cells in 120 µL of flow cytometry buffer (1× PBS containing 1% FBS).
h. Transfer the cell suspensions to a 96-well plate and proceed to flow cytometry analysis, in our case using a CytoFLEX (Beckman Coulter, USA) (Figure 6A).

4. Preparing cells for FACS.
Note: The cells should have reached 80% confluence in a T25 or a T75 flask. The following protocol assumes a T75 flask.
a. Remove the culture medium and wash the cells by adding 3 mL of 1× PBS.
b. Remove the PBS, add 3 mL of 1× Trypsin-EDTA, and incubate at 37°C and 5% CO2 until the cells are completely detached from the flask.
f. Prepare a second tube containing culture medium, which will be used to collect the sorted cells.
Note: The collection tube size should be adapted to the number of cells expected to be positive.
g. Proceed to the sorting.
h. Centrifuge the sorted cells at 400 × g for 5 min at 20°C-22°C, remove the supernatant very carefully, and leave ~300-500 µL of liquid in the tube.
i. Resuspend the cells and plate them into a well.
j. Add an appropriate volume of fresh culture medium on top (troubleshooting 4 and 5).
Note: The well size should correspond to the number of collected cells, e.g., around 1000 cells can be plated in a 48- or 24-well plate depending on the doubling time of the cell line (in our case, BG1 is a fast-growing cell line).

5. Further procedure:
a. Exchange the medium one day after sorting.
b. Expand the cell culture until a sufficient number of cells is reached (25 cm² culture flask with ~3 × 10^6 live cells) for flow cytometry analysis to determine the EGFP+ enriched population, and repeat step 3 (Preparing cells for flow cytometry analysis) (Figure 6B).
c. Repeat step 4 (Preparing cells for FACS) if another enrichment is necessary. These enrichment cycles should be repeated until the desired percentage of EGFP+ cells is reached (Figure 6B).
d. As soon as the cells have been sufficiently enriched, expand them and either proceed to your individual experiments or freeze them using freezing medium (FBS containing 10% DMSO, filtered through a 0.2 µm filter).
Pause point: Cells can be frozen at this point and are stable at -80°C for up to 12 months or for several years when stored in liquid nitrogen.
CRITICAL: It is crucial to choose an appropriate well size to plate the cells after sorting. The cells will not grow well or will take more time to reach full confluence if the chosen well size is too large. The doubling time of the cell line should also be considered.

Major step three: Validation of the CDH1 knock-in cells by PCR, Western blot, and immunofluorescence
Timing: 1 week (this is flexible as not everything has to be performed at the same time) (for steps 6 to 8)
The successful integration of the EGFP-encoding DNA sequence at the C-terminus of E-cadherin must be further validated, in addition to the confirmation by flow cytometry analysis. Here, we address genomic DNA and the expression of the tagged protein.
f. Dilute the genomic DNA to 100 ng/µL.
g. Prepare the PCR reaction master mix following Table 9 below. Please note that the volumes given are for one reaction. These should be multiplied by the total number of reactions needed. In our case, we prepared the master mix for four reactions (BG1 wildtype, BG1 CDH1-EGFP knock-in, water control, and one additional reaction to compensate for pipetting errors). In addition, three independent master mixes should be prepared, one for each primer pair: CDH1_PCR_F1 / CDH1_PCR_R1, CDH1_PCR_F1 / EGFP-N_R, EGFP-C_F / CDH1_PCR_R1. Prepare the master mix in 1.5 mL tubes without adding the DNA. Pipet 9 µL of the master mix into four PCR tubes and add the respective DNA or ddH2O to each tube. Mix the tubes gently and spin them down briefly.
h. Place the PCR tubes into the thermocycler and apply the following programme (Table 10):
i. In the meantime, prepare a 1.2% (w/v) agarose gel.
j. When the PCR reaction is finished, add 2 µL of 6× loading dye to each sample, mix and spin down briefly.
k. Load everything onto the agarose gel and let it run at 120 V for ~20-30 min or until sufficient separation has been achieved.
l. Take an image of the gel using a gel doc system (Figure 6C).
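The master mix arithmetic described in step g (per-reaction volumes multiplied by the number of samples plus one spare reaction, once per primer pair) can be sketched in Python. The per-reaction composition below is a placeholder that merely sums to 9 µL and is not the composition of Table 9; the function name and dictionary layout are likewise hypothetical. Only the primer pair names and the reaction count are taken from the text.

```python
# Primer pairs named in step g.
PRIMER_PAIRS = [
    ("CDH1_PCR_F1", "CDH1_PCR_R1"),
    ("CDH1_PCR_F1", "EGFP-N_R"),
    ("EGFP-C_F", "CDH1_PCR_R1"),
]

# Placeholder per-reaction volumes (µL) summing to 9 µL; NOT Table 9.
PER_REACTION_UL = {
    "2x polymerase mix": 5.0,
    "forward primer": 0.5,
    "reverse primer": 0.5,
    "ddH2O": 3.0,
}


def master_mix(per_reaction_ul, n_samples, n_extra=1):
    """Multiply per-reaction volumes by samples plus spare reactions."""
    n_reactions = n_samples + n_extra
    return {name: round(vol * n_reactions, 2) for name, vol in per_reaction_ul.items()}


# Three samples (wildtype, knock-in, water control) plus one spare reaction.
mixes = {pair: master_mix(PER_REACTION_UL, n_samples=3) for pair in PRIMER_PAIRS}
for pair, mix in mixes.items():
    print(pair, mix, "total:", sum(mix.values()), "µL")
```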
CRITICAL: It is important to start the genomic DNA extraction with a sufficient number of cells in order to obtain high-quality DNA with a sufficient yield.
It is also important to adapt the PCR master mix if you are using a different polymerase and to change the PCR conditions and the percentage of the agarose gel depending on the expected size of your amplicon. If the wildtype and knock-in bands show only a small difference in size, you should also increase the duration of gel electrophoresis for better separation.
Note: From now on, always keep the protein lysates on ice.
Pause point: The protein lysates can be stored at -80°C for long-term storage. Thaw on ice for further usage.
ix. Measure the protein concentration with the Pierce™ BCA Protein Assay Kit according to the manufacturer's protocol.
x. Dilute the protein lysates to a concentration of 2 µg/µL using ddH2O and add 4× gel loading buffer (incl. DTT: add 100 µL of 2 M DTT solution to 400 µL of 4× gel loading buffer).
xi. Boil at 95°C for 5 min to denature the proteins. From this point on, the samples can be handled at 20°C-22°C. The boiled lysates can be loaded onto an acrylamide gel or stored at -20°C.
Pause point: The boiled protein lysates can be stored at -20°C for long-term storage. After thawing, heat them for 2 min at 60°C. They are now ready to use.
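The dilution in step x can be worked out with a few lines of Python. The sketch assumes that 2 µg/µL is the protein concentration of the final sample, i.e., after the 4× loading buffer has been added to one quarter of the final volume; the text does not state this explicitly. The function name and the 40 µL final volume are hypothetical.

```python
def loading_sample(stock_ug_per_ul, final_ug_per_ul=2.0,
                   final_volume_ul=40.0, buffer_strength=4.0):
    """Return lysate, loading buffer and ddH2O volumes (µL) for one sample."""
    buffer_ul = final_volume_ul / buffer_strength
    lysate_ul = final_ug_per_ul * final_volume_ul / stock_ug_per_ul
    water_ul = final_volume_ul - buffer_ul - lysate_ul
    if water_ul < 0:
        raise ValueError("lysate too dilute for the requested final concentration")
    return {"lysate": lysate_ul, "4x buffer": buffer_ul, "ddH2O": water_ul}


# Example: a lysate measured at 8 µg/µL by BCA assay (made-up value).
print(loading_sample(8.0))
```

A lysate that is too dilute for the requested concentration raises an error rather than returning a negative water volume.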

b. Preparation of 10% acrylamide/bis gels for Western blot.
Note: This should be done inside a chemical hood.
i. Clean the glass plates (Bio-Rad, thickness 1.5 mm) with ddH2O, then with 70% ethanol, and assemble them according to the manufacturer's instructions.
ii. Prepare the resolving gel in a 50 mL tube by combining the reagents listed in Table 11 in the given order (consider 10 mL per gel).
Note: As soon as the TEMED has been added, pour the gel immediately, as the polymerization will begin.
iii. Pour the gel between the assembled glass plates, leaving enough space for the stacking gel (~1.5 cm from the top).
v. Prepare the stacking gel in a 50 mL tube by combining the reagents listed in Table 12 in the given order (consider 3 mL per gel).
Note: As soon as the TEMED has been added, pour the gel immediately, as the polymerization will begin.
vi. Pour off the layer of isopropanol or ddH2O and pour the stacking gel by filling the glass plates completely to the top.
vii. Immediately add the comb (in our case with 10 wells), which must be suitable for 1.5 mm glass plates.
viii. Leave the gel to solidify for ~15 min.
Pause point: The gels can now be wrapped in moist tissue paper and kept in a sealed plastic bag at 4°C for up to 1 week.
c. Assembly and SDS-PAGE.
i. Prepare 1× Running Buffer: Dilute 10× Running Buffer by adding 100 mL to 900 mL of ddH2O.
ii. Remove the glass plates with the gels from the Bio-Rad system, remove the combs carefully, and rinse the glass plates and the wells with ddH2O.
iii. Assemble the glass plates with the gels inside the electrophoresis apparatus and fill up the middle with 1× Running Buffer until reaching the top of the glass plates. Fill up the outside of the glass plates as far as indicated on the chamber.
iv. Load 5 µL of the ladder and the desired amount of the protein lysates into the wells.
v. Close the electrophoresis apparatus and apply 90 V until the samples reach the resolving gel, then increase to 120 V until good separation is achieved. This can be verified by following the separation of the ladder.
Note: At some point, the smallest proteins will leak out of the bottom of the gel and be lost, so stop it in time.
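Working solutions such as the 1× Running Buffer (from a 10× stock) or 1× TAE buffer (from a 50× stock) follow the usual relation C1 × V1 = C2 × V2. A minimal Python sketch with a hypothetical function name:

```python
def dilute_stock(stock_x, final_x, final_volume_ml):
    """Return (stock mL, ddH2O mL) needed to prepare a diluted working solution."""
    if final_x > stock_x:
        raise ValueError("working strength cannot exceed stock strength")
    stock_ml = final_volume_ml * final_x / stock_x
    return stock_ml, final_volume_ml - stock_ml


print(dilute_stock(10, 1, 1000))  # 1x Running Buffer from 10x stock, 1 L
print(dilute_stock(50, 1, 100))   # 1x TAE from 50x stock, 100 mL
```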
d. Transfer onto PVDF membrane.
i. Prepare 1× Transfer buffer as described in the following table and place it at 4°C. It can be reused up to 3 times.
ii. Soak a PVDF membrane in methanol inside a glass petri dish for 30 s, then transfer it to ddH2O.
iii. Soak sponges and Whatman paper in a container with 1× Transfer buffer.
iv. Remove the glass plates from the electrophoresis apparatus, separate the plates, and place the gel into the container with 1× Transfer buffer.
v. Assemble the sandwich consisting of a sponge, Whatman paper, gel, PVDF membrane, Whatman paper, and sponge, and place it inside the transfer apparatus.
vi. Fill the chamber with 1× Transfer buffer and add a container with ice.
vii. Apply 75 V for 1.5 h in total. Exchange the ice container after 45 min.
i. Prepare the primary antibody solutions in 50 mL tubes by taking 5 mL of 1× TBST with 3% BSA and adding the desired primary antibody at the manufacturer's recommended dilution for Western blot.
ii. Prepare a clean surface and place the PVDF membrane on top.
iii. Cut it to the desired size (use the ladder bands to know where to cut).
iv. Place the membrane pieces in the corresponding 50 mL tubes containing the primary antibody.
v. Incubate at 4°C for 16 h on a roller shaker.
vi. Remove the membrane pieces from the tubes and place them in individual boxes.
Note: The tubes containing the primary antibody solution can be frozen at -20°C and reused several times.
vii. Add 5 mL of 1× TBST to each membrane piece and incubate for 10 min with gentle agitation at 20°C-22°C. Then discard the solution. Repeat this washing step twice more.
g. Secondary antibodies.
i. Prepare the secondary antibody solution by diluting the antibody 1:10,000 in 1× TBST (e.g., 10 mL containing 1 µL of secondary antibody).
ii. Add 5-10 mL of the secondary antibody solution to each membrane piece and incubate for 3 h at 20°C-22°C with gentle agitation. Then discard the secondary antibody solution (do not reuse).
iii. Add 5 mL of 1× TBST to each membrane piece and incubate for 10 min with gentle agitation at 20°C-22°C. Then discard the solution. Repeat this step twice more.
h. Development.
i. Place the membrane on a clean surface, e.g., a piece of plastic foil that has been cleaned with 70% ethanol, and drain excess liquid with a tissue (avoid touching the surface of the membrane with the tissue).
ii. Mix equal amounts of SuperSignal™ West Dura Luminol/Enhancer and SuperSignal™ West Dura Stable Peroxide from the SuperSignal™ West Dura Extended Duration Substrate kit and distribute dropwise onto the membrane until the whole surface is covered.

viii. Prepare the primary antibody solution by adding the primary antibody to the antibody dilution buffer at the dilution for immunofluorescence suggested by the manufacturer.
ix. Remove the blocking buffer, add 250 µL of the primary antibody solution, and incubate for 16 h at 4°C.
c. Staining and mounting.
i. Remove the primary antibody solution and add 500 µL of washing buffer. Remove the washing buffer and repeat this washing step twice more.
ii. Prepare the secondary antibody solution by diluting the secondary antibody 1:500 in antibody dilution buffer and add 250 µL of the secondary antibody solution to the cells.
iii. Incubate for at least 3 h in the dark at 20°C-22°C.
iv. Remove the secondary antibody solution and carefully remove the upper part of the slide (starting from one edge). Try to do this in a slow and steady movement to avoid breaking the cell layer.
v. Wash the slide by immersion in 1× PBS in a light-protected container.
vi. Remove any excess liquid from the slide by gently tapping the edge of the slide onto tissue paper.
vii. Add 150-200 µL of ProLong Gold mounting medium (containing DAPI) in the middle of the slide.
viii. Carefully cover the cells with a 24 mm × 60 mm coverslip.
Note: Avoid trapping any air bubbles under the coverslip, as imaging will not be possible at these spots.
ix. Seal the edges of the coverslip with transparent nail polish and allow the mounting medium to cure for 5 min in the dark.
Pause point: The slide can be stored at 4°C in the dark.
x. Allow the slide to cure for 1 day at 4°C in the dark.
xi. Proceed to a (confocal) fluorescence microscope and perform the analysis. In our case, we used a spinning-disk Nikon CSU W1 confocal microscope (CFI Plan Apo Lambda 40× Air, NA 0.95), and images were stored in the secure Open Microscopy Environment OMERO for reproducible and robust image analysis (University of Basel, Imaging core facility) (Figure 6F).
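The secondary antibody dilutions used above (1:10,000 for Western blot and 1:500 for immunofluorescence) translate into pipetting volumes as shown in the following Python sketch. It assumes that a 1:N dilution means one part antibody in N parts total volume; the function name is hypothetical.

```python
def antibody_volume_ul(total_volume_ul, dilution_factor):
    """Antibody volume (µL) for a 1:dilution_factor dilution in the total volume."""
    if dilution_factor < 1:
        raise ValueError("dilution_factor must be at least 1")
    return total_volume_ul / dilution_factor


# Western blot: 1:10,000 in 10 mL (10,000 µL) of 1x TBST.
print(antibody_volume_ul(10_000, 10_000))
# Immunofluorescence: 1:500 in 250 µL per well.
print(antibody_volume_ul(250, 500))
```

For very small volumes such as the 0.5 µL per well in the second case, preparing a larger batch for several wells may improve pipetting accuracy.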

EXPECTED OUTCOMES
The current protocol allows us to C-terminally tag the transmembrane glycoprotein E-cadherin in the ovarian cancer cell line BG1 using the CRISPR-Cas9 system and NHEJ. A set of validation methods confirms successful integration of the EGFP-encoding sequence (Figure 6). The tagged protein can in theory be used for downstream assays studying factors that affect CDH1 protein expression, to immunoprecipitate the protein via the EGFP tag in order to characterize post-translational modifications or potential binding partners, or to trace endogenous E-cadherin-expressing cells in vitro and in vivo. In principle, the method described herein allows tagging of any protein of interest by designing a new sgRNA targeting the desired protein-encoding genomic locus.
In the meantime, we have applied this experimental procedure to breast cancer (MCF-7 and T47D) and leukemia (HEL) cell lines tagging either E-cadherin or CD117 encoded by KIT, respectively (data not shown).

LIMITATIONS
Generation of knock-in cells using fluorescent proteins such as the EGFP described herein relies heavily on the targeted cell lines. The gene editing efficiency also depends on the transfection efficiency obtained with the commercially available transfection reagent. Another important aspect is the endogenous expression of the protein of interest. Low expression may result in insufficient fluorescence intensity for downstream enrichment and analysis. Thus, the protein expression should be determined in advance.
Another important factor is the genomic locus chosen for insertion: the insertion can cause frameshift mutations, or the integrated EGFP-encoding DNA sequence may interfere with the function of the protein, as highlighted in Cumin et al. (2022).1 If FACS enrichment is not a valid option, selecting individual cell clones harboring the knock-in of interest can be considered. We believe that our approach of enriching for EGFP+ cells reduces the potential pitfall of clonal effects, considering the heterogeneity of cancer cell lines. This is supported by single-cell transcriptomics of cancer cell lines, which revealed that clinically relevant markers were maintained; however, the authors showed that breast cancer cell lines express these markers heterogeneously among cells, even within the same cell line. Moreover, they observed dynamic plasticity in the regulation of HER2 expression in the MDA-MB-361 cell line, with striking consequences for drug response.10 This finding may be an issue when working with selected clones. Apart from cancer cell line heterogeneity, avoiding clonal selection also substantially reduces costs, because each individual clone derived from a parental knock-in culture must be characterized before further use. In addition, after delivery of the plasmid set, single cells have to be isolated to generate clonal populations that can be verified as valid knock-ins. Limiting dilution cloning and FACS are common approaches, but neither is ideal in terms of efficacy, control of the selection, and cell viability. Limiting dilution cloning is economical and may cause less cellular stress than FACS, but it is severely limited in generating highly validated single-cell clones. Furthermore, clonal selection introduces affinity-based selection steps11 whose products may not represent the characteristics of the parental cells.

TROUBLESHOOTING
Problem 1
Potential re-cutting by Cas9 at the targeted genomic locus (related to Preparation 2).

Potential solution
It is important to mention that our experimental setup in principle utilizes any sgRNA sequence together with the described plasmid set which we consider as an advantage over traditional CRISPR-Cas9 HDR strategies. However, our described experimental setup is limited to available sgRNA recognition sites at the desired genomic locus. Our method also does not allow mutation of the guide recognition site to minimize re-cutting by Cas9 endonuclease after successful gene-editing. We aim to overcome this potential limitation by transiently delivering our set of plasmids including the Cas9 encoding vector.

Problem 2
Low in silico scores predicted by Benchling (related to Preparation 2 / Step 6).

Potential solution
In principle, Benchling recommends selecting sgRNAs with high on-target and off-target scores. Our protocol, however, relies on the position of the guide recognition site at the genomic locus to allow in-frame insertion of the fluorescent protein. A low on-target score may explain a low percentage of EGFP+ cells. We therefore gave the in silico-predicted on- and off-target scores lower priority than the position of the recognition site.

Problem 3
Transfection of the human cells leads to increased cell death resulting in many floating/detached cells on the next day (related to Step 2).

Potential solution
Before starting the actual experiment, it is recommended to test different transfection reagents with the cell line of interest using a control plasmid. This will provide information on which reagent is the most suitable resulting in viable cells and high transfection efficiency. Apart from lipofection, electroporation would be an alternative to enhance transfection efficiency.

Problem 4
In general, FACS enrichment results in a low number of EGFP+ cells. This in turn may lead to insufficient growth (related to Step 4).

Potential solution
Because the expected low knock-in efficiency results in a low number of EGFP+ cells, make sure to add enough culture medium to the FACS collection tube, as cells might be lost if they stick to the wall of the tube.
Additionally, plate the cells in a smaller well size so that the cells are in closer proximity. This also depends on the cell line of interest, which might be more or less capable of tolerating the stress of sorting and of being plated at low density.

Problem 5
Low or non-detectable EGFP+ cells due to out-of-frame insertion (related to Step 4).

Potential solution
E-cadherin that is C-terminally tagged with an out-of-frame fluorescent protein may show a different molecular weight in Western blot analysis. Regardless, consider another sgRNA recognition site for insertion of the fluorescent protein.
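Whether an insertion preserves the reading frame can be reasoned about with simple modulo-3 arithmetic. The Python sketch below is an illustration under stated assumptions and not part of the validated protocol. It assumes that SpCas9 cleaves 3 bp upstream of the PAM, that the guide is written in coding-strand orientation so that its last three bases are the stop codon, and that the number of coding nucleotides upstream of the cut (2646 here) is merely an illustrative multiple of three. All function and parameter names are hypothetical.

```python
GUIDE = "GAGGCGGCGAGGACGACTAG"  # sgCDH1 target sequence from this protocol

# Assumption: SpCas9 cuts 3 bp upstream of the PAM, i.e. after base 17 of the guide.
cut_index = len(GUIDE) - 3
print(GUIDE[:cut_index], "|", GUIDE[cut_index:])


def junction_in_frame(coding_nt_before_cut, donor_nt_before_egfp=0, indel_nt=0):
    """True if the first EGFP codon stays in the reading frame of the tagged gene.

    coding_nt_before_cut: coding nucleotides from the start codon to the cut.
    donor_nt_before_egfp: donor nucleotides inserted ahead of the EGFP start.
    indel_nt: net bases gained (+) or lost (-) at the junction during NHEJ.
    """
    return (coding_nt_before_cut + donor_nt_before_egfp + indel_nt) % 3 == 0


print(junction_in_frame(2646))               # precise ligation, no extra bases
print(junction_in_frame(2646, indel_nt=1))   # single-base insertion at junction
```

A single-base indel at the junction shifts the frame, which is one way an integrated EGFP sequence can fail to produce fluorescence.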

RESOURCE AVAILABILITY
Lead contact
Further information and requests for resources and reagents should be directed to and will be fulfilled by the lead contact, Francis Jacob (francis.jacob@unibas.ch).

Materials availability
All materials used in this experiment are available through commercial sources and Addgene, as highlighted in the key resources table, with the exception of the modified double-cut EGFP only plasmid derived from the double cut NH-donor. The plasmid can be shared upon request to the lead contact, Francis Jacob (francis.jacob@unibas.ch).

Data and code availability
To the best of our knowledge, we have provided all information necessary. If any further information is required to perform the experiment described in this work, the lead contact welcomes requests.